

Learning semantic similarity in a continuous space

Neural Information Processing Systems

We address the problem of learning semantic representations of questions in order to measure similarity between pairs as a continuous distance metric. Our work naturally extends Word Mover's Distance (WMD) [1] by representing text documents as normal distributions instead of bags of embedded words. Our learned metric measures the dissimilarity between two questions as the minimum amount of distance the intent (hidden representation) of one question needs to travel to match the intent of the other. We first learn to repeat and reformulate questions, inferring intents as normal distributions with a deep generative model [2] (a variational autoencoder). Semantic similarity between pairs is then learned discriminatively as an optimal transport distance metric (Wasserstein-2) with our novel variational siamese framework. Among known models that read each sentence individually, our proposed framework achieves competitive results on the Quora duplicate questions dataset. Our work sheds light on how deep generative models can approximate distributions (semantic representations) to effectively measure semantic similarity with meaningful distance metrics from information theory.
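For Gaussian intents, the Wasserstein-2 distance mentioned above has a closed form. A minimal sketch for the diagonal-covariance case (the usual parameterization of a variational autoencoder's latent posterior): the squared W2 distance between N(mu1, diag(sigma1^2)) and N(mu2, diag(sigma2^2)) reduces to ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2. The function name and arguments below are illustrative, not from the paper.

```python
import numpy as np

def w2_squared_diag(mu1, sigma1, mu2, sigma2):
    """Squared 2-Wasserstein distance between two diagonal-covariance
    Gaussians. Closed form for the diagonal case:
    ||mu1 - mu2||^2 + ||sigma1 - sigma2||^2."""
    mu1, sigma1, mu2, sigma2 = map(np.asarray, (mu1, sigma1, mu2, sigma2))
    return float(np.sum((mu1 - mu2) ** 2) + np.sum((sigma1 - sigma2) ** 2))
```

Two questions whose inferred intent distributions share means and scales thus score a distance of zero, regardless of surface wording.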










Neural Information Processing Systems

Set prediction tasks require a matching between the predicted set and the ground-truth set in order to propagate the gradient signal. Recent works have performed this matching in the original feature space, thus requiring predefined distance functions.
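To make the matching step concrete, here is a minimal sketch of optimal one-to-one assignment between two equal-size point sets under a predefined Euclidean distance. It brute-forces all permutations for clarity; practical set-prediction losses use the O(n^3) Hungarian algorithm instead. The function name is illustrative, not from the paper.

```python
import itertools
import math

def match_sets(pred, gt):
    """Find the one-to-one assignment of predicted points to ground-truth
    points that minimizes total Euclidean distance, by exhaustive search
    over permutations (tractable only for tiny sets)."""
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(len(gt))):
        cost = sum(math.dist(pred[i], gt[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    return best_perm, best_cost
```

The matched cost is what the gradient flows through; the criticism above is that the pairwise distance here is fixed in advance rather than learned.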


Few-Shot Non-Parametric Learning with Deep Latent Variable Model

Neural Information Processing Systems

By only training a generative model in an unsupervised way, the framework uses the data distribution to build a compressor. Using a compressor-based distance metric derived from Kolmogorov complexity, together with a few labeled examples, NPC-LV classifies without further training.
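A compressor-based distance of this kind is typically the Normalized Compression Distance (NCD), which approximates the uncomputable Kolmogorov-complexity information distance with a real compressor's output length. The sketch below uses stdlib gzip purely as a stand-in; NPC-LV builds its compressor from a trained deep latent variable model, not gzip.

```python
import gzip

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance between two byte strings:
    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C(.) is the compressed length. Similar inputs compress well
    together, yielding a distance near 0."""
    cx = len(gzip.compress(x))
    cy = len(gzip.compress(y))
    cxy = len(gzip.compress(x + y))
    return (cxy - min(cx, cy)) / max(cx, cy)
```

Classification with a few labeled examples then reduces to nearest-neighbor search under this metric, which is why no further training is needed once the compressor exists.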